feat(remote): Add checksum path filters for download and extract tasks (fixes #68).#113
Conversation
WalkthroughThe tar and zip extraction tasks now accept Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant TaskRunner
participant Downloader
participant Extractor
participant Checksum
Client->>TaskRunner: run download-and-extract-(zip|tar) with CHECKSUM_*_PATTERNS
TaskRunner->>Downloader: fetch archive
Downloader-->>TaskRunner: archive
TaskRunner->>Extractor: extract archive (pass EXCLUDE_PATTERNS)
Extractor-->>TaskRunner: extracted files
TaskRunner->>Checksum: compute/validate (pass CHECKSUM_INCLUDE_PATTERNS, CHECKSUM_EXCLUDE_PATTERNS)
Checksum-->>TaskRunner: checksum result / skip decision
TaskRunner-->>Client: task result (skipped or updated)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 0/1 reviews remaining, refill in 60 minutes.Comment |
junhaoliao
left a comment
There was a problem hiding this comment.
let's refer to #69 and add tests to cover those parameters
the title should include (resolves #68).
junhaoliao
left a comment
There was a problem hiding this comment.
waiting for tests to be ported over then this is good to go
junhaoliao
left a comment
There was a problem hiding this comment.
the rest lgtm. just docstring issues
| # @param {string[]} [CHECKSUM_EXCLUDE_PATTERNS=[]] Path wildcard patterns, relative to any | ||
| # `CHECKSUM_INCLUDE_PATTERNS`, to exclude from the checksum. |
There was a problem hiding this comment.
the "relative to any CHECKSUM_INCLUDE_PATTERNS" wording seems ambiguous. since whether CHECKSUM_INCLUDE_PATTERNS is specified or not, those patterns will be excluded, i think it's fine to remove the wording to avoid confusion?
| # @param {string[]} [CHECKSUM_EXCLUDE_PATTERNS=[]] Path wildcard patterns, relative to any | |
| # `CHECKSUM_INCLUDE_PATTERNS`, to exclude from the checksum. | |
| # @param {string[]} [CHECKSUM_EXCLUDE_PATTERNS=[]] Path wildcard patterns to exclude from the | |
| # checksum computation. |
There was a problem hiding this comment.
we should follow the EXCLUDE_PATTERNS docstring for checksum:compute and checksum:validate.
How about:
# @param {string[]} [CHECKSUM_EXCLUDE_PATTERNS=[]] Path wildcard patterns, relative to any `CHECKSUM_INCLUDE_PATTERNS` (or OUTPUT_DIR by default), to exclude from the checksum.
There was a problem hiding this comment.
how about
# @param {string[]} [CHECKSUM_EXCLUDE_PATTERNS=[]] Path wildcard patterns to omit from the
# checksum. Each pattern applies to all paths selected for checksum computation,
# whether the paths are from `INCLUDE_PATTERNS` or `OUTPUT_DIR`.
cc @davidlion
| # @param {string[]} [CHECKSUM_EXCLUDE_PATTERNS=[]] Path wildcard patterns, relative to any | ||
| # `CHECKSUM_INCLUDE_PATTERNS`, to exclude from the checksum. |
There was a problem hiding this comment.
| # @param {string[]} [CHECKSUM_EXCLUDE_PATTERNS=[]] Path wildcard patterns, relative to any | |
| # `CHECKSUM_INCLUDE_PATTERNS`, to exclude from the checksum. | |
| # @param {string[]} [CHECKSUM_EXCLUDE_PATTERNS=[]] Path wildcard patterns to exclude from the | |
| # checksum computation. |
Co-authored-by: Junhao Liao <junhao@junhao.ca>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@taskfiles/remote/tests.yaml`:
- Around line 163-165: CHECKSUM_EXCLUDE_PATTERNS currently contains hardcoded
fixture paths (".github/CODEOWNERS", ".github/PULL_REQUEST_TEMPLATE.md");
replace these literal strings with the shared extracted-path variables already
defined in the repo (use the existing extracted-path vars rather than
hardcoding) so the patterns reference those variables in
taskfiles/remote/tests.yaml (update the CHECKSUM_EXCLUDE_PATTERNS array to use
the shared vars and ensure those vars are imported/available in the same scope).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 869e3d2d-c4ac-4862-8ea2-08ec52815e48
📒 Files selected for processing (1)
taskfiles/remote/tests.yaml
There was a problem hiding this comment.
♻️ Duplicate comments (2)
exports/taskfiles/utils/checksum.yaml (1)
13-14:⚠️ Potential issue | 🟡 MinorClarify
EXCLUDE_PATTERNSsemantics to avoid implying root-anchored matches.Line 13 and Line 78 currently describe excludes as matching “full paths rooted at
INCLUDE_PATTERNS”, but matching is broader in practice (non-anchored wildcard matching under the include set). This wording can cause incorrect pattern authoring.Suggested doc wording
- # `@param` {string[]} [EXCLUDE_PATTERNS=[]] Path wildcard patterns to exclude from checksum - # computation, applied as glob-style matches to full paths rooted at the `INCLUDE_PATTERNS`. + # `@param` {string[]} [EXCLUDE_PATTERNS=[]] Path wildcard patterns to exclude from checksum + # computation, matched against paths within the `INCLUDE_PATTERNS` scope.Based on learnings:
CHECKSUM_EXCLUDE_PATTERNSvalues are expected relative to the extracted content root (for example,.github/CODEOWNERS), without the top-level extracted directory prefix.Also applies to: 78-79
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@exports/taskfiles/utils/checksum.yaml` around lines 13 - 14, Update the documentation wording for EXCLUDE_PATTERNS (and CHECKSUM_EXCLUDE_PATTERNS) to state that exclude patterns are non-anchored glob patterns applied relative to the extracted content root (e.g., ".github/CODEOWNERS") and match anywhere under the included set, not as full paths rooted at the INCLUDE_PATTERNS prefix; replace the phrase “full paths rooted at the `INCLUDE_PATTERNS`” with a clarification that patterns are evaluated against paths relative to the extracted content root and may match any descendant path under the included patterns.exports/taskfiles/utils/remote.yaml (1)
63-67:⚠️ Potential issue | 🟡 MinorMirror the same docstring clarification here for checksum exclude patterns.
Line 63 and Line 153 repeat the same “full paths rooted at
CHECKSUM_INCLUDE_PATTERNS” phrasing, which can mislead callers about how to author excludes.Based on learnings: in remote tests,
CHECKSUM_EXCLUDE_PATTERNSentries are relative to extracted content root (for example,.github/CODEOWNERS) and are not interchangeable with prefixed extracted-directory paths.Also applies to: 153-157
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@exports/taskfiles/utils/remote.yaml` around lines 63 - 67, Update the docstring for CHECKSUM_EXCLUDE_PATTERNS to mirror the clarification added for includes: state that exclude patterns are matched against paths relative to the extracted content root (e.g., ".github/CODEOWNERS") and are not prefixed by the extracted output directory or interchangeable with CHECKSUM_INCLUDE_PATTERNS roots; update both occurrences that currently say “full paths rooted at the `CHECKSUM_INCLUDE_PATTERNS`” to this clarified wording and keep the existing example of CHECKSUM_INCLUDE_PATTERNS ({{.OUTPUT_DIR}}) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@exports/taskfiles/utils/checksum.yaml`:
- Around line 13-14: Update the documentation wording for EXCLUDE_PATTERNS (and
CHECKSUM_EXCLUDE_PATTERNS) to state that exclude patterns are non-anchored glob
patterns applied relative to the extracted content root (e.g.,
".github/CODEOWNERS") and match anywhere under the included set, not as full
paths rooted at the INCLUDE_PATTERNS prefix; replace the phrase “full paths
rooted at the `INCLUDE_PATTERNS`” with a clarification that patterns are
evaluated against paths relative to the extracted content root and may match any
descendant path under the included patterns.
In `@exports/taskfiles/utils/remote.yaml`:
- Around line 63-67: Update the docstring for CHECKSUM_EXCLUDE_PATTERNS to
mirror the clarification added for includes: state that exclude patterns are
matched against paths relative to the extracted content root (e.g.,
".github/CODEOWNERS") and are not prefixed by the extracted output directory or
interchangeable with CHECKSUM_INCLUDE_PATTERNS roots; update both occurrences
that currently say “full paths rooted at the `CHECKSUM_INCLUDE_PATTERNS`” to
this clarified wording and keep the existing example of
CHECKSUM_INCLUDE_PATTERNS ({{.OUTPUT_DIR}}) unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6340ee8c-487f-4734-a2fa-40087fe422c1
📒 Files selected for processing (2)
exports/taskfiles/utils/checksum.yamlexports/taskfiles/utils/remote.yaml
Description
(closes #69 )
For
npmpackages, sometimes we only want to checksum a subset of the downloaded source, or ignorein-place updates in known locations such as
node_modules.node_modulesshould live inside the source tree because package managers likenpmandYarninstall relative to the project root, and many tools assume this layout.However, these updates can cause checksum drift. Adding
CHECKSUM_EXCLUDE_PATTERNSto download and extract tasks is the most direct and practical way to handle this. Beyondnode_modules, it can filter out build outputs, caches, logs, generated docs, and other nonessential files that do not affect the core content of the source tree.Alongside this, we introduce
CHECKSUM_INCLUDE_PATTERNSfor completeness, but it is more narrow in scope. It is useful when only a small, well defined portion of a larger directory is the actual artifact of interest and the rest is generated. Currently, it defaults to the entire extraction output directory, preserving the original behavior before this PR.Checklist
breaking change.
Validation performed
Summary by CodeRabbit
New Features
Tests
Documentation